Preposition Semantic Classification via Treebank and FrameNet
Authors
Abstract
This paper reports on experiments in classifying the semantic role annotations assigned to prepositional phrases in both PENN TREEBANK (version II) and FRAMENET (version 0.75). In both cases, experiments are done to see how the prepositions can be classified given the dataset’s role inventory, using standard word-sense disambiguation features, such as the parts of speech of surrounding words and collocations indicative of the particular roles. In addition to using traditional word collocations, the experiments incorporate class-based collocations in the form of WordNet hypernyms. Separate classifiers are produced for each preposition. For TreeBank, the word collocations achieve slightly better performance: 78.5% versus 77.4% for the combined collocations. However, for FrameNet, the combined collocations achieve better performance: 70.3% versus 68.5%. Furthermore, when classifying all the TreeBank prepositions together, the combined collocations yield a noticeable gain at 85.8% accuracy versus 81.3% for word-only collocations.
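The feature types named in the abstract (parts of speech of surrounding words, word collocations, and hypernym-based class collocations) can be illustrated with a minimal sketch. This is not the authors' implementation: the function name `extract_features`, the window size, the feature-string format, and the tiny `TOY_HYPERNYMS` table (a stand-in for a real WordNet lookup) are all assumptions made for illustration, and the sketch omits any step that selects which collocations are indicative of a role.

```python
from typing import Dict, List, Sequence

# Hypothetical stand-in for WordNet hypernym lookup (toy data, not from the paper).
TOY_HYPERNYMS: Dict[str, List[str]] = {
    "monday": ["weekday", "time_period"],
    "kitchen": ["room", "location"],
}


def extract_features(
    tokens: Sequence[str],
    tags: Sequence[str],
    prep_index: int,
    window: int = 2,
    hypernyms: Dict[str, List[str]] = TOY_HYPERNYMS,
) -> Dict[str, int]:
    """Build binary features for the preposition at tokens[prep_index]."""
    features: Dict[str, int] = {}
    for offset in range(-window, window + 1):
        if offset == 0:
            continue  # skip the preposition itself
        i = prep_index + offset
        if not 0 <= i < len(tokens):
            continue  # context position falls outside the sentence
        word = tokens[i].lower()
        # Part of speech of a surrounding word, keyed by relative position.
        features[f"pos[{offset:+d}]={tags[i]}"] = 1
        # Word collocation: the context word itself.
        features[f"word={word}"] = 1
        # Class-based collocation: each hypernym of the context word.
        for h in hypernyms.get(word, []):
            features[f"hyper={h}"] = 1
    return features


tokens = ["She", "arrived", "on", "Monday"]
tags = ["PRP", "VBD", "IN", "NNP"]
feats = extract_features(tokens, tags, prep_index=2)
```

In this sketch the hypernym features let a classifier generalize from "Monday" to other time expressions it has not seen, which is the intuition behind combining class-based collocations with word collocations.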
Similar resources
Exploiting Semantic Role Resources for Preposition Disambiguation
This article describes how semantic role resources can be exploited for preposition disambiguation. The main resources include the semantic role annotations provided by the Penn Treebank and FrameNet tagged corpora. The resources also include the assertions contained in the Factotum knowledge base, as well as information from Cyc and Conceptual Graphs. A common inventory is derived from these i...
Preposition Disambiguation: Still a Problem
Considerable recent progress has been made in preposition disambiguation using the SemEval 2007 corpus, with results reaching accuracy of over 88 percent. However, with a new corpus of tagged instances, use of the models shows a decline in performance to around 43 percent. This suggests that recent efforts suffer from an out-of-domain problem. Detailed examination of the dimensions of this prob...
The Preposition Project
Prepositions are an important vehicle for indicating semantic roles. Their meanings are difficult to analyze and they are often discarded in processing text. The Preposition Project is designed to provide a comprehensive database of preposition senses suitable for use in natural language processing applications. In the project, prepositions in the FrameNet corpus are disambiguated using a sense...
Integrating Data from The Preposition Project into FrameNet
In the course of The Preposition Project, FrameNet sentences are used as instances to characterize preposition behavior. FrameNet sentences containing prepositional phrases beginning with a given preposition are presented to a lexicographer, whereupon the given preposition is tagged with a sense from a sense inventory derived from the Oxford Dictionary of English. For the 34 prepositions used in...
Corpus Annotation within the French FrameNet: a Domain-by-domain Methodology
This paper reports on the development of a French FrameNet, within the ASFALDA project. While the first phase of the project focused on the development of a French set of frames and corresponding lexicon (Candito et al., 2014), this paper concentrates on the subsequent corpus annotation phase, which focused on four notional domains (commercial transactions, cognitive stances, causality and verb...